A Unified Framework for Data-Driven Atomistic Modeling
With advances in computing, atomistic modeling has grown into an important tool in many fields of science in recent decades. This work focuses on two important and popular classes of atomistic modeling methods: density functional theory (DFT) and molecular dynamics force fields (FF). They are considered together because they follow a similar abstract formalism: both consist of a model space that describes the local chemical environment, and a model that connects the model space to the local property of interest. Hence, they also face similar difficulties in their development. Neither has an established strategy for systematically improving the quality of its model space, and both suffer from models that are not arbitrarily accurate. These are the challenges that this work attempts to address.
In this dissertation, an innovative framework for describing local chemical environments, called the multipole (MP) descriptor family, is proposed and examined. It possesses many desired properties, such as being mathematically complete, systematically improvable, physically meaningful, and applicable to both electronic and atomic environments. As examples, three specific cases of the MP descriptor family, Heaviside step multipole (HSMP), Legendre polynomial multipole (LPMP), and Gaussian multipole (GMP), are formulated and used to build the model spaces for DFT and FF models. In addition, tools and protocols are developed for training models that connect the MP descriptors to energies for DFT and FFs. We focus on building data-driven machine learning (ML) models in this work, because ML has been shown to be a promising alternative to analytical models. These new tools include a near-uniform sampling algorithm and software packages for training and understanding the models. Also, an interactive visualization tool called ElectroLens is developed to help scientists explore and gain intuition about the model spaces and models of both DFT and FF.
Finally, based on the applications of the MP descriptors in electronic and atomic systems, a powerful framework called chemical environment modeling theory (CEMT) is proposed. CEMT offers a new perspective to generalize and unify the concepts of FF (atomistic and coarse-grained) and DFT, and shows that they are just two special cases of a continuum of methods. A data-driven approach to accelerate model training under this CEMT framework is also proposed. Together, the MP feature family and this new framework could help guide the development of the next generation of data-driven atomistic modeling methods.
Ph.D.
ElectroLens: Understanding Atomistic Simulations Through Spatially-resolved Visualization of High-dimensional Features
In recent years, machine learning (ML) has gained significant popularity in
the field of chemical informatics and electronic structure theory. These
techniques often require researchers to engineer abstract "features" that
encode chemical concepts into a mathematical form compatible with the input to
machine-learning models. However, there is no existing tool to connect these
abstract features back to the actual chemical system, making it difficult to
diagnose failures and to build intuition about the meaning of the features. We
present ElectroLens, a new visualization tool for high-dimensional
spatially-resolved features to tackle this problem. The tool visualizes
high-dimensional data sets for atomistic and electron environment features by a
series of linked 3D views and 2D plots. The tool is able to connect different
derived features and their corresponding regions in 3D via interactive
selection. It is built to be scalable and to integrate with existing
infrastructure.
Comment: accepted to IEEE Visualization 2019 conference
The Role of Reference Points in Machine-Learned Atomistic Simulation Models
This paper introduces the Chemical Environment Modeling Theory (CEMT), a
novel, generalized framework designed to overcome the limitations inherent in
traditional atom-centered Machine Learning Force Field (MLFF) models, widely
used in atomistic simulations of chemical systems. CEMT demonstrated enhanced
flexibility and adaptability by allowing reference points to exist anywhere
within the modeled domain, thus enabling the study of various model
architectures. Utilizing Gaussian Multipole (GMP) featurization functions,
several models with different reference point sets, including finite difference
grid-centered and bond-centered models, were tested to analyze the variance in
capabilities intrinsic to models built on distinct reference points. The
results underscore the potential of non-atom-centered reference points in force
training, revealing variations in prediction accuracy, inference speed and
learning efficiency. Finally, a unique connection between CEMT and real-space
orbital-free finite element Density Functional Theory (FE-DFT) is established,
and the implications include the enhancement of data efficiency and robustness.
It allows the leveraging of spatially-resolved energy densities and charge
densities from FE-DFT calculations, as well as serving as a pivotal step
towards integrating known quantum-mechanical laws into the architecture of ML
models.
Characteristics of enzymolysis of silkworm pupa protein after tri-frequency ultrasonic pretreatment: kinetics, thermodynamics, structure and antioxidant changes
As a by-product of the sericulture industry, silkworm pupa resources currently have a low utilization rate. Converting their proteins into bioactive peptides through enzymatic hydrolysis not only addresses the utilization problem but also creates more valuable nutritional additives. Silkworm pupa protein (SPP) was pretreated with tri-frequency ultrasound (22/28/40 kHz). The effects of ultrasonic pretreatment on the enzymolysis kinetics, enzymolysis thermodynamics, hydrolysate structure, and hydrolysate antioxidant activity of SPP were investigated. Ultrasonic pretreatment significantly increased the hydrolysis efficiency, showing a 6.369% decrease in km and a 16.746% increase in kA after ultrasonic action (p < 0.05). The SPP enzymolysis reaction followed a second-order rate kinetics model. Evaluation of enzymolysis thermodynamics revealed that ultrasonic pretreatment markedly enhanced the SPP enzymolysis, leading to a 21.943% decrease in Ea. In addition, ultrasonic pretreatment significantly increased the SPP hydrolysate's surface hydrophobicity, thermal stability, crystallinity, and antioxidant activities (DPPH radical scavenging activity, Fe2+ chelation ability, and reducing power). This study indicated that tri-frequency ultrasonic pretreatment could be an efficient approach to enhancing the enzymolysis and improving the functional properties of SPP. Therefore, tri-frequency ultrasound technology can be applied industrially to enhance enzyme reaction processes.
Fingerprinting and visualizing electronic environment
Presented on November 7, 2018 at 6:00 p.m. in the Georgia Tech Hotel and Conference Center, room 236.
Xiangyun Lei is a graduate research assistant at the Georgia Institute of Technology with extensive experience in the processing and visualization of complex data.
Runtime: 03:14 minutes
Quantumness of ensemble via coherence of Gram matrix
The Gram matrix of a set of vectors, which encapsulates the relations between the constituent vectors, plays an important role in the exploration of both geometric and information-theoretic aspects of quantum state space. In view of its usefulness and importance, we study the Gram matrix of an ensemble of pure states (a set of pure states with a prior probability distribution) and reveal its fundamental properties. We highlight and exploit the fact that the Gram matrix of an ensemble can be formally regarded as a bona fide quantum state. The key idea is to employ coherence (with respect to the computational basis) of the Gram matrix (regarded as a quantum state) to quantify quantumness of the corresponding ensemble. In particular, we propose to use the l1-norm of coherence and the relative entropy of coherence, of the Gram matrix, as two significant quantifiers of quantumness. We illustrate the effectiveness and power of these quantifiers of quantumness by evaluating them for several important ensembles arising from quantum measurement and quantum cryptography. We further compare the quantumness based on the Gram matrix with various other quantifiers of quantumness in the literature, and show that although they are closely related in general, they have subtle differences and capture different aspects of ensembles, which shed light on the complexity of quantum ensembles.
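The two quantifiers named in this abstract can be illustrated with a short numerical sketch. The abstract does not spell out the formulas, so the code below rests on assumptions: the Gram matrix of an ensemble {(p_i, |psi_i>)} is taken to be G_ij = sqrt(p_i p_j) <psi_i|psi_j>, which has unit trace and is positive semidefinite; the l1-norm of coherence is the sum of the absolute off-diagonal entries; and the relative entropy of coherence is S(diag(G)) - S(G), with S the von Neumann entropy in bits. The function names are hypothetical and chosen only for this example.

```python
import numpy as np

def gram_matrix(states, probs):
    """Assumed definition: G_ij = sqrt(p_i p_j) <psi_i|psi_j>."""
    psi = np.asarray(states, dtype=complex)
    # Normalize each state vector (one state per row).
    psi = psi / np.linalg.norm(psi, axis=1, keepdims=True)
    weights = np.sqrt(np.asarray(probs, dtype=float))
    scaled = psi * weights[:, None]
    # Entry (i, j) is the inner product of scaled state i with scaled state j.
    return scaled.conj() @ scaled.T

def entropy_bits(eigenvalues):
    """Shannon/von Neumann entropy in bits, ignoring numerically zero values."""
    vals = np.asarray(eigenvalues, dtype=float)
    vals = vals[vals > 1e-12]
    return float(-np.sum(vals * np.log2(vals)))

def l1_coherence(gram):
    """Sum of the absolute values of the off-diagonal entries."""
    return float(np.abs(gram).sum() - np.abs(np.diag(gram)).sum())

def relative_entropy_coherence(gram):
    """S(diag(G)) - S(G) with respect to the computational basis."""
    diagonal_entropy = entropy_bits(np.real(np.diag(gram)))
    full_entropy = entropy_bits(np.linalg.eigvalsh(gram))
    return diagonal_entropy - full_entropy

# Orthogonal ensemble {|0>, |1>} with equal priors.
orthogonal = gram_matrix([[1, 0], [0, 1]], [0.5, 0.5])
# Non-orthogonal ensemble {|0>, |+>} with equal priors.
overlapping = gram_matrix([[1, 0], [1, 1]], [0.5, 0.5])

print(l1_coherence(orthogonal), relative_entropy_coherence(orthogonal))
print(l1_coherence(overlapping), relative_entropy_coherence(overlapping))
```

Under these assumed definitions, an ensemble of mutually orthogonal states gives a diagonal Gram matrix and both quantifiers vanish, while overlapping states produce off-diagonal entries and strictly positive values.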